Ondřej Čertík
Los Alamos National Laboratory
Abstract: In this keynote we show how Python is used in scientific computing, its relevance and importance from a variety of standpoints. We motivate the use of Python and its applications to visualization (Matplotlib, VTK, Mayavi), symbolic computing (SymPy) and numerical computing. For numerical, array oriented algorithms, it is hard to beat modern Fortran in terms of simplicity and speed, so we compare NumPy/SciPy side by side with Fortran to show that the syntax is almost identical and thus very natural to go back and forth. We show how to wrap Fortran and C++ code into Python. We provide examples how these tools are used in warm dense matter and other research at LANL. Finally, we show how all of the above tasks can be conveniently executed from the IPython Notebook, an interactive environment for scientific computing.
%pylab inline
Populating the interactive namespace from numpy and matplotlib
Source: Fernando Pérez
y = sin(k*t)
.Source: Fernando Pérez
Source: Fernando Pérez
Computing in science must improve drastically before we can really call it scientific.
Source: Fernando Pérez
Barriers and discontinuities in workflow in between all the steps
Source: Fernando Pérez
Source: Fernando Pérez
Source: Brian Granger
Simple Matplotlib Demo
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
def plotit(amplitude, color):
fig, ax = plt.subplots(figsize=(4, 3),
subplot_kw={'axisbg':'#EEEEEE',
'axisbelow':True})
ax.grid(color='w', linewidth=2, linestyle='solid')
x = np.linspace(0, 10, 1000)
ax.plot(x, amplitude * np.sin(x), color=color,
lw=5, alpha=0.4)
ax.set_xlim(0, 10)
ax.set_ylim(-1.1, 1.1)
return fig
from ipywidgets import StaticInteract, RangeWidget, RadioWidget
StaticInteract(plotit,
amplitude=RangeWidget(0.1, 1.0, 0.1),
color=RadioWidget(['blue', 'green', 'red']))
def make_plot(Z, R, orbitals, density):
fig, (ax, ax2, ax3) = plt.subplots(1, 3, figsize=(12.5, 3),
subplot_kw={'axisbg':'#EEEEEE',
'axisbelow':True})
ax.grid(color='w', linewidth=2, linestyle='solid')
for i in range(size(orbitals, 1)):
ax.plot(R, orbitals[:, i], lw=5, alpha=0.4)
ax.set_xlim(0, 4)
ax.set_ylim(-10, 10)
ax.set_title("Atomic orbitals (Z=%d)" % Z)
ax.set_xlabel("$R$ [a.u.]")
ax.set_ylabel("Orbitals [a.u.]")
ax2.grid(color='w', linewidth=2, linestyle='solid')
ax2.plot(R, R**2*density, lw=5, alpha=0.4)
ax2.set_xlim(0, 4)
ax2.set_ylim(0, 5)
ax2.set_title("Electronic charge density (Z=%d)" % Z)
ax2.set_xlabel("$R$ [a.u.]")
ax2.set_ylabel("$n(r) r^2$ [a.u.]")
ax3.grid(color='w', linewidth=2, linestyle='solid')
ax3.semilogx(R, R**2*density, lw=5, alpha=0.4)
ax3.set_xlim(1e-4, 40)
ax3.set_ylim(0, 5)
ax3.set_title("Electronic charge density (Z=%d)" % Z)
ax3.set_xlabel("$R$ [a.u.] (log scale)")
ax3.set_ylabel("$n(r) r^2$ [a.u.]")
return fig
from dftatom import atom_lda
from numpy import size
def plotit(Z):
a = 2.7e6
rmin = 1e-7
rmax = 50
N = 5500
E_tot, ks_energies, n, l, f, R, Rp, V_tot, density, orbitals = \
atom_lda(Z, rmin, rmax, a, N, 1e-11, 100, 1e-10, 0.35, 100, True)
return make_plot(Z, R, orbitals, density)
StaticInteract(plotit, Z=RangeWidget(1, 40))
Source: Fernando Pérez
from sympy import init_printing, var, sqrt, factor, integrate, erf
init_printing(use_latex=True)
var("x")
e = factor(x**36-1)
e
e.expand()
erf(x).series(x, 0, 6)
integrate(1/(x**3+1), x)
Chris Smith, Aaron Meurer, Mateusz Paprocki, Ondřej Čertík, Matthew Rocklin, Julien Rioux, Ronan Lamy, Raoul Bourquin, Sergey B Kirpichev, Kirill Smelkov, Øyvind Jensen, Tom Bachmann, Mario Pernici, Sergiu Ivanov, Saptarshi Mandal, Stefan Krastanov, Thilina Rathnayake, Rick Muller, Sean Vig, Brian E. Granger, David Li, Vinzent Steinberg, Vladimir Perić, Gilbert Gede, Raymond Wong, Fredrik Johansson, Fabian Pedregosa, Bharath M R, Addison Cugini, Thomas Hisch, Guru Devanla, Manoj Kumar, Alexey U. Gudchenko, Sachin Joglekar, hm, Priit Laes, Prasoon Shukla, Matt Habel, Alan Bromborsky, Timothy Reluga, Tomo Lazovich, Matt Curry, Mary Clark, Pablo Puente, Jason Gedge, Christopher Dembia, Aleksandar Makelov, Katja Sophie Hotz, Brian Jorgensen, Kendhia, Francesco Bonazzi, Ramana Venkata, Andy R. Terrel, Joachim Durchholz, Grzegorz Świrski, Pearu Peterson, Sebastian Krämer, Joan Creus, Siddhanathan Shanmugam, Toon Verstraelen, Cristóvão Sousa, Christian Muise, Jorn Baayen, Jeremias Yehdegho Alexander Hirzel, Kevin Hunter, Matthew Hoff, Riccardo Gori, Steve Anton, Sanket Agarwal, Jason Moore, Robert Schwarz, David Ju, Angadh Nanjangud, Luke Peterson, Oliver Lee, Renato Coutinho, Yuriy Demidov, Bilal Akhtar, Stepan Roucka, Chetna Gupta, Miha Marolt, Stephen Loo, Nathan Alison, Niklas Thörne, Saurabh Jha, jerryma1121, Brian Stephanik, Robert Kern, Sachin Irukula, Sam Sleight, Angus Griffith, Patrick Lacasse, Swapnil Agarwal, Gary Kerr, Mike Boyle, Natalia Nawara, Nicolas Pourcelot, Sherjil Ozair, Ankit Agrawal, Huijun Mai, Jim Zhang, Ljubiša Moćić, Prafullkumar P. Tale, Marek Šuppa, Freddie Witherden, Roberto Nobrega, David Joyner, Felix Kaiser, Friedrich Hagedorn, Randy Heydon, Saroj Adhikari, Sean Ge, Alexey Subach, Alkiviadis G. Akritas, CJ Carey, Eric Nelson, Jaroslaw Tworek, Yuri Karadzhov, Christian Bühler, Rishabh Dixit, Ryan Krauss, Amit Saha, Ananya H, Andreas Kloeckner, Demian Wassermann, Mark Dewing, Min Ragan-Kelley, Raphael Michel, Sam Magura, Tim Swast, Chancellor Arkantos, Chris Wu, Christophe Saint-Jean, Davy Mao, Harold Erbin, Khagesh Patel, Manish Gill, Matthew Brett, Nichita Utiu, Piotr Korgul, Roland Puntaier, Tarun Gaba, Tobias Lenz, Tomasz Buchert, Abderrahim Kitouni
Alexandr Popov, Chris Conley, David Roberts, Florian Mickler, Harsh Gupta, Imran Ahmed Manzoor, Jochen Voss, Nimish Telang, Rom le Clair, Sebastian Kreft, Stefan van der Walt, Stefano Maggiolo, Stefen Yin, Tiffany Zhu, Tristan Hume, Varun Joshi, Óscar Nájera, Akshay Srinivasan, Akshit Agarwal, Amit Jamadagni, Andrew Straw, Barry Wardell, Benjamin McDonald, Bill Flynn, Case Van Horsen, Emma Hogan, Geoffry Song, George Waksman, Heiner Kirchhoffer, Jens H. Nielsen, Julio Idichekop Filho, Luca Weihs, Luis Garcia, Manoj Babu K., Martin Povišer, Nikolay Lazarov, Pan Peng, Raffaele De Feo, Shravas K Rao, Ted Horst, Arpit Goyal, Ashwini Oruganti, Ben Goodrich, Boris Timokhin, Bradley Froehle, Colleen Lee, David Marek, Dmitry Batkovich, Fernando Perez, Goutham Lakshminarayan, Henrik Johansson, Jack McCaffery, James Aspnes, James Fiedler, Jezreel Ng, Jurjen N.E. Bos, Michael Mayorov, Nikhil Sarda, Oleksandr Gituliar, Oscar Benjamin, Pavel Fedotov, Pradyumna, QuaBoo, Sai Nikhil, Thomas Dixon, Thomas Wiecki, Tomáš Bambas, Tuomas Airaksinen, rathmann, tsmars15, Acebulf, Alexander Eberspächer, Alexandr Gudulin, Ali Raza Syed, Anatolii Koval, Andre de Fortier Smit, Andrej Tokarčík, Andrew Docherty, Bastian Weber, Benjamin Fishbein, Bernhard R. Link, Björn Dahlgren, Carsten Knoll, Christian Schubert, Comer Duncan, David Lawrence, Eh Tan, Elrond der Elbenfuerst, Erik Welch, Gert-Ludwig Ingold, Gregory Ksionda, Hubert Tsang, James Abbatiello, James Goppert, James Pearson, Jeremy, Johann Cohen-Tanugi, Jorge E. Cardona, Joseph Dougherty, Kaifeng Zhu, Kazuo Thow, Kevin Goodsell, Kibeom Kim, Konrad Meyer, Lars Buitinck, Madeleine Ball, Marcin Kostrzewa, Markus Müller, Matt Rajca, Matthew Tadd, Matthias Toews, Max Hutchinson, Nicholas J.S. Kinar, Or Dvory, Paul Strickland, Pauli Virtanen, Prateek Papriwal, Puneeth Chaganti, Rizgar Mella, Robert, Robert Cimrman, Roberto Colistete, Jr., Sebastian Krause, Seshagiri Prabhu, Shai 'Deshe' Wyborski, Shruti Mangipudi, Siddhant Jain, Srinivas Vasudevan, Stepan Simsa, Takafumi Arakaki, Tarang Patel, Thomas Sidoti, Tim Lahey, Tyler Pirtle, Vasily Povalyaev, Vinay Kumar, Vinit Ravishankar, Vladimir Lagunov, marshall2389, vishal, Łukasz Pankowski
def get_data(filename):
data = array([int(l.split()[0]) for l in open("data/" + filename).readlines()])
return data
for project in ["sympy", "ipython", "numpy", "matplotlib", "sklearn", "pandas", "scipy"]:
data = get_data("%s-year.txt" % project)
plot(data, lw=2, label="%s last year" % project)
axhline(50, lw=1, color='k', linestyle='--')
legend()
grid()
xlabel("Individual committer")
ylabel("# of commits")
xlim([0, 25]);
#ylim([0, 1]);
savefig("plots/commits1.png")
for project in ["sympy", "ipython", "numpy", "matplotlib", "sklearn", "pandas", "scipy", "linux"]:
data = get_data("%s-year.txt" % project)
plot(data, lw=2, label="%s last year" % project)
axhline(50, lw=1, color='k', linestyle='--')
legend()
grid()
xlabel("Individual committer")
ylabel("# of commits")
xlim([0, 25]);
ylim([0, 1200]);
savefig("plots/commits1b.png")
for project in ["sympy", "ipython", "numpy", "matplotlib", 'sklearn', 'pandas', 'scipy']:
data = get_data("%s-all.txt" % project)
#data = data / float(sum(data))
#data = data / float(data[0])
data = np.append(data, [0.55]*(300 - len(data)))
semilogy(data, lw=2, label="%s all" % project)
legend()
grid()
xlabel("individual people")
ylabel("total number of patches")
xlim([0, 300]);
ylim([0.6, 1e4]);
savefig("plots/commits-all.png")
for project in ["sympy", "ipython", "numpy", "matplotlib", 'sklearn', 'pandas', 'scipy', "linux"]:
data = get_data("%s-all.txt" % project)
#data = data / float(sum(data))
#data = data / float(data[0])
data = np.append(data, [0.55]*(300 - len(data)))
semilogy(data, lw=2, label="%s all" % project)
legend()
grid()
xlabel("individual people")
ylabel("total number of patches")
xlim([0, 300]);
ylim([0.6, 1e4]);
savefig("plots/commits-all-b.png")
Last year:
Last year:
All time:
All time:
Domain specific language for numeric scientific computing
Not so good for other things
Python:
from numpy import array, size, shape, min, max, sum
a = array([1, 2, 3])
print shape(a)
print size(a)
print max(a)
print min(a)
print sum(a)
Fortran:
integer :: a(3)
a = [1, 2, 3]
print *, shape(a)
print *, size(a)
print *, maxval(a)
print *, minval(a)
print *, sum(a)
Python:
from numpy import array, dot
a = array([[1, 2], [3, 4]])
b = array([[2, 3], [4, 5]])
print a * b
print dot(a, b)
Fortran
integer :: a(2, 2), b(2, 2)
a = reshape([1, 2, 3, 4], [2, 2], order=[2, 1])
b = reshape([2, 3, 4, 5], [2, 2], order=[2, 1])
print *, a * b
print *, matmul(a, b)
Python
from numpy import reshape
a = reshape([1, 2, 3, 4, 5, 6], (2, 3))
b = reshape([1, 2, 3, 4, 5, 6], (2, 3), order="F")
print a[0, :]
print a[1, :]
print
print b[0, :]
print b[1, :]
Fortran
integer :: a(2, 3), b(2, 3)
a = reshape([1, 2, 3, 4, 5, 6], [2, 3], order=[2, 1])
b = reshape([1, 2, 3, 4, 5, 6], [2, 3])
print *, a(1, :)
print *, a(2, :)
print *
print *, b(1, :)
print *, b(2, :)
Guido van van Rossum (2007):
Python 3.0 will break backwards compatibility. Totally. We're not even aiming for a specific common subset.
Initial recommended way:
Problems:
Travis-CI
)
cmake
, make
, ...), it is not robust
setup.py
/distutilsThe recommended way now:
compatibility.py
file (i.e. from compatibility import range
)Conclusion:
Fernando Pérez's words: